(54) Telephone speech recognition



A telephone line interface 1 with a line connection data acquisition function analyzes a call received from the telephone line to acquire line connection information such as the country, the route, and other data of the call. The line connection data processor 2 selects one of the acoustic analyzers according to the line connection data from the interface 1 and also one of the reference speech model storages. The speech pattern matcher compares an acoustic vector train output of the selected acoustic analyzer with the speech model given from the selected reference speech model storage for speech recognition.
TELEPHONE SPEECH RECOGNITION SYSTEM



The present invention relates to a speech recognition system and, more particularly, to a telephone speech recognition system.

A conventional speech recognition system is shown in Fig. 3. When speech transmitted via a telephone line has to be identified, it is fed via a telephone line interface to the speech recognition system. The conventional system comprises an acoustic analyzer 32, a speech pattern matcher 33, and a reference speech model storage 34.

The speech of the caller is introduced to the acoustic analyzer 32, where it is separated at equal intervals on the basis of a Hamming window and subjected to acoustic analysis such as cepstrum analysis to produce an acoustic vector train. The reference speech model storage 34 saves speech models such as HMMs (hidden Markov models). The speech pattern matcher 33 compares the acoustic vector train with the speech models saved in the reference speech model storage 34 and delivers the results as outputs of the speech recognition.

In the conventional manner, however, the speech recognition is carried out regardless of the conditions of the line connection from the other end of the line to the telephone speech recognition system. More specifically, the action of the speech pattern matcher 33 with the speech models saved in the reference speech model storage 34 is executed without regard to any of the conditions of line connection. Accordingly, when a particular noise from the telephone line is involved, or there is a difference in frequency characteristic between the conventional telephone line and a line used with a system such as a mobile telephone set, a desired level of speech recognition will hardly be accomplished.

It is particularly difficult for such a conventional speech recognition system to recognize voice data in a call received from an international telephone line, which varies depending on the terminal and line systems of a country.

It is therefore desirable to provide a telephone speech recognition system capable of performing a high level of speech recognition without being affected by various conditions of a telephone line, while alleviating the foregoing drawbacks of the prior art.

According to the present invention there is provided a telephone speech recognition system for recognizing speech data received from a telephone line, comprising: a telephone line interface connected to the telephone line for detecting line connection data; a plurality of acoustic analyzers having means for removing noise derived from the line characteristics and/or the route characteristics; and a line connection data processor responsive to the line connection data from the telephone line interface for selecting the acoustic analyzers. In addition, a plurality of reference speech model storages for saving speech models corresponding to the line characteristics may be provided.
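The telephone speech recognition system thus allows the acoustic analyzer and the reference speech model to be selected according to the line connection data, thus enhancing the quality of the speech recognition.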

The invention will now be described by way of example with reference to the accompanying drawings, in which:

Fig. 1 is a block diagram showing a first embodiment of the present invention;

Fig. 2 is a block diagram showing a second embodiment of the present invention; and

Fig. 3 is a block diagram showing a conventional speech recognition system.

Embodiments of the invention will be described in detail with reference to the accompanying drawings.






Fig. 1 shows a first embodiment of the present invention.

As shown, a telephone line interface 1 having a known function of line connection data acquisition is connected to a telephone line of a network or a switching board.

The telephone line interface 1 with the line connection data acquisition function examines line connection data of a received call, including the telephone number of a caller, the interconnection in a private branch exchange, the country of the caller in the case of an international call, and the route of transmission (e.g. via satellite links or underwater cable links). The line connection data a extracted by the telephone line interface 1 is transmitted to a line connection data processor 2, while speech data b of the call is given to a first switching unit 3.

In response to the line connection data a from the telephone line interface 1 with line connection data acquisition function, the line connection data processor 2 actuates the first switching unit 3 and a second switching unit 6 to select either a first 4 or a second acoustic analyzer 5. Each of the first 4 and the second acoustic analyzer 5 separates the speech data at equal intervals of substantially 10 ms on the basis of a Hamming window of about 25 ms and subjects its data segments to acoustic analysis such as cepstrum analysis to produce a train of acoustic vectors.
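The framing and cepstrum analysis described above can be illustrated with a minimal Python sketch. It is not taken from the specification: the 8 kHz sampling rate, the number of cepstral coefficients, and all function names are assumptions chosen for illustration, and a plain real cepstrum stands in for whatever cepstrum variant the analyzers actually use.

```python
import cmath
import math


def hamming(n):
    """Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frames(samples, rate_hz=8000, hop_ms=10, win_ms=25):
    """Cut the signal into windowed segments: one 25 ms frame every 10 ms."""
    hop = rate_hz * hop_ms // 1000
    win = rate_hz * win_ms // 1000
    w = hamming(win)
    return [[s * c for s, c in zip(samples[i:i + win], w)]
            for i in range(0, len(samples) - win + 1, hop)]


def real_cepstrum(frame, n_coeffs=12):
    """Low-order real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    log_mag = []
    for k in range(n):
        x = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(abs(x) + 1e-10))  # small offset avoids log(0)
    # The log magnitude of a real signal is symmetric, so the inverse DFT is real.
    return [sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
            for q in range(n_coeffs)]


def acoustic_vector_train(samples, rate_hz=8000):
    """One cepstral vector per frame, in time order."""
    return [real_cepstrum(f) for f in frames(samples, rate_hz)]
```

With these assumed values each frame holds 200 samples and consecutive frames start 80 samples apart, so neighbouring frames overlap by 15 ms.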

It is now noted that the speech may be free from or contain a noise in a particular frequency range of the voice signal depending on the route of transmission or the country of the caller; for example, any call from a specific nation in Europe carries such a noise. For handling the former and the latter, the first 4 and the second acoustic analyzer 5 respectively are connected in parallel for selective use.
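The selection between the two parallel analyzers can be pictured as a simple lookup on the line connection data. The following Python sketch is only an illustration: the field names, the placeholder country code "XX", and the rule table are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LineConnectionData:
    """Hypothetical container for the data extracted by the line interface."""
    caller_number: str
    country: str
    route: str  # e.g. "satellite" or "cable"


# Hypothetical rule table: countries whose calls carry the in-band noise.
NOISY_COUNTRIES = {"XX"}


def select_analyzer(data):
    """Return 5 (analyzer with noise removal) or 4 (plain analyzer)."""
    return 5 if data.country in NOISY_COUNTRIES else 4
```

A real rule table could equally be keyed on the route of transmission or on a combination of country and route.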

The first acoustic analyzer 4 analyzes the speech which contains no such noise. The second acoustic analyzer 5 has a notch filter or the like for removing the noise to produce an acoustic vector train from the noise-free speech. The embodiment is not limited to the two acoustic analyzers 4 and 5 shown, and three or more acoustic analyzers may be used when three or more noise-imposed speech data are received.
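As an illustration of the kind of notch filter the second acoustic analyzer might apply, here is a minimal second-order IIR notch in Python. The specification does not give the filter design; the 8 kHz sampling rate, the pole radius r, and the function name are assumptions.

```python
import math


def notch_filter(samples, notch_hz, rate_hz=8000, r=0.95):
    """Second-order IIR notch: zeros on the unit circle at the notch
    frequency, poles at radius r just inside the circle."""
    w0 = 2 * math.pi * notch_hz / rate_hz
    b1 = -2 * math.cos(w0)       # numerator: 1 + b1*z^-1 + z^-2
    a1 = -2 * r * math.cos(w0)   # denominator: 1 + a1*z^-1 + a2*z^-2
    a2 = r * r
    out = []
    x1 = x2 = y1 = y2 = 0.0
    for x in samples:
        y = x + b1 * x1 + x2 - a1 * y1 - a2 * y2
        out.append(y)
        x2, x1 = x1, x
        y2, y1 = y1, y
    return out
```

A pole radius closer to 1 gives a narrower notch at the cost of a longer transient; the gain away from the notch is close to, but not exactly, unity.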
There are provided three reference speech model storages 7, 8 and 9, one of which is selected in response to the line connection data a. The speech pattern matcher 11 compares a speech model from the selected storage with the acoustic vector train c transmitted from the second switching unit 6 for speech recognition and delivers a result.

In the embodiment, the reference speech model storages save speech models corresponding to the line connection data, including the characteristics of the telephone line and the noise. The speech models saved in the storages can be increased in quality through learning.

A second embodiment of the present invention will be described referring to Fig. 2.

The second embodiment differs from the first embodiment in the speech model storages, which are then explained in more detail.

In particular, the first 12 and the second speech model storage 13 save noise models. A speech commonly includes a silence pause, and it is essential for the speech recognition to correctly discriminate the silence pause from the other speech sounds. However, an intrinsic noise is sometimes imposed on the silence pause when the call travels through a particular route. For identification of such intrinsic noises, their models are saved in the first 12 and the second speech model storage 13.

A third switching unit 10 is actuated by a signal from the line connection data processor 2 to selectively connect either the first 12 or the second speech model storage 13 to the speech pattern matcher 11. The third speech model storage 14 saves speech models for use regardless of the country of a caller and/or the route of transmission. More specifically, the speech models saved in the third speech model storage 14 are identical to those, e.g. HMMs, saved in the speech model storage 34 shown in Fig. 3.

The speech pattern matcher 11 identifies the silence pause in a speech from the noise models supplied through the third switching unit 10 and collates the speech with the speech model from the third speech model storage 14 to recognize voice sounds in the speech. A result of the speech recognition is then delivered as an output.
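The idea of using a route-specific noise model to find silence pauses can be sketched very roughly in Python. This is a deliberately simplified stand-in, not the matcher of the specification: the HMM-based matching is replaced by a nearest-centroid comparison, and the two reference vectors and the function names are assumptions.

```python
import math


def distance(a, b):
    """Euclidean distance between two acoustic vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def label_frames(vector_train, noise_mean, speech_mean):
    """Label each acoustic vector 'pause' if it lies closer to the selected
    noise model than to a (hypothetical) average speech vector."""
    return ["pause" if distance(v, noise_mean) <= distance(v, speech_mean)
            else "speech"
            for v in vector_train]
```

Frames labelled as pauses would then be excluded from, or treated separately in, the collation against the speech models.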



The first 12 and the second speech model storage 13 can improve the noise models through learning. In practice, the speech data received from a telephone line is saved in memory means and, when the telephone line is disconnected, is fed to the first 4 or the second acoustic analyzer 5 for extracting its noise data, which is then saved. In this manner of learning, the noise models saved in the first 12 and the second speech model storage 13 can be improved.
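One simple way to picture this learning step is an incremental average of the noise vectors extracted after each call. The Python sketch below assumes the noise model is a single mean vector, which is a simplification; the specification does not state the form of the noise models or the update rule, and the class name is hypothetical.

```python
class NoiseModel:
    """Hypothetical noise model: running mean of observed noise vectors."""

    def __init__(self, dim):
        self.mean = [0.0] * dim
        self.count = 0

    def update(self, noise_vector):
        """Fold one newly extracted noise vector into the running mean."""
        self.count += 1
        self.mean = [m + (v - m) / self.count
                     for m, v in zip(self.mean, noise_vector)]
```

One such model would be kept per storage, so that calls arriving over a given route only update the noise model associated with that route.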

While the speech model storage 14 saves the speech models for use regardless of the country of a caller and/or the route of transmission in the previous embodiment, it may be replaced by storages corresponding to various speech models, one of which is selected by the line connection data processor 2.

The action of the second embodiment is as follows. According to the country of the caller and/or the route of transmission, the speech model storage 12 or 13 is selected by the line connection data processor 2. The speech data from the telephone line interface 1 of a call received from the specific country or via the particular route is thus subjected to the speech recognition with the most appropriate acoustic analyzer and speech model, whereby the quality of speech recognition of the data will be increased.

As set forth above, the present invention allows the acoustic analyzer and the reference speech model storage to be selected according to the country of a caller and/or the route of transmission, thus enhancing the quality of speech recognition. When a non-continuous insertion signal is contained in a call from a public booth in a specific country, it is eliminated by selecting and using a desired one of the acoustic analyzers. In domestic use, a call from a mobile telephone is identified by its station number and can be subjected to the speech recognition with use of speech models for mobile telephones, thus having a higher quality of speech recognition.

Embodiments of the present invention allow the line connection data, including the country of a caller and/or the route of transmission, to be used for selecting the acoustic analyzer and the speech models. Accordingly, the speech recognition is performed with reference to the line characteristics and the noise characteristics and will be increased in quality.
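1. A telephone speech recognition system for recognizing speech data received from a telephone line, comprising:

a telephone line interface connected to the telephone line for detecting line connection data;

a plurality of acoustic analyzers having means for removing noise derived from at least one of the line characteristics and the route characteristics; and

a line connection data processor for selecting the acoustic analyzers in response to the line connection data from the telephone line interface.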

2 - A telephone speech rec 
comprising; ° 9n *° n Sys ^ according to claim 

a Plurality of reference sd^k 
^ toatw ^P*** model storages for savinn 

j£asta **aetJi i Li^ ctetaa^ Sav ' n 9 speech 

«<*u,ed tothetetephonei . ne St0ra9eSSa "^P«c f , TOJ<Jels 
4 A telephone speech r»™ • 

' can be reco^ea by a SD * * "" y *- In 

noise moBels . ^""^^^ferhngtothe 
5. A telephone speech recognition system according to claim 3 or 4, wherein the speech models saved in the reference speech model storages are updated by learning.

6. A telephone speech recognition system as hereinbefore described with reference to Figures 1 to 3 of the accompanying drawings.